Linearly Interpolated Hierarchical N-gram Language Models for Speech Recognition Engines
Authors
Abstract
Language modeling is a crucial component in natural-language continuous speech recognition, owing to the difficulty posed by continuous speech [1], [2]. Language modeling attempts to capture regularities in natural language in order to improve recognition performance. Many studies have shown that the word error rate of automatic speech recognition (ASR) systems decreases significantly when statistical language models are used [3], [4], [5]. The purpose of language models (LMs) is to compute the probability $P(W)$ of a sequence of words $W = w_1, w_2, \ldots, w_m$. This probability can be expressed as $P(W) = \prod_{i=1}^{m} P(w_i \mid h_i)$, where $h_i = w_1, \ldots, w_{i-1}$ is the history or context of the word $w_i$. The probability $P(w_i \mid h_i)$ becomes difficult to estimate as the number of words in $h_i$ increases. To overcome this problem, we can introduce equivalence classes on the histories in order to reduce their cardinality. N-gram language models approximate the dependence of each word $w_i$ (regardless of $i$) on the $n-1$ words preceding it: $P(w_i \mid h_i) \approx P(w_i \mid w_{i-n+1}, \ldots, w_{i-1})$. The probability of the sequence can then be expressed as $P(W) \approx \prod_{i=1}^{m} P(w_i \mid w_{i-n+1}, \ldots, w_{i-1})$.
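As a minimal illustration of the linear-interpolation idea named in the title, the Python sketch below combines maximum-likelihood unigram, bigram, and trigram estimates with fixed weights. The weight values, the `<s>`/`</s>` padding, and all identifiers are assumptions made for this sketch; it does not implement the paper's hierarchical scheme.

```python
from collections import Counter

class InterpolatedTrigramLM:
    """Sketch of a linearly interpolated trigram LM:
    P(w | u, v) = l3*P_ML(w | u, v) + l2*P_ML(w | v) + l1*P_ML(w),
    with interpolation weights l3 + l2 + l1 = 1 (values assumed here)."""

    def __init__(self, corpus, lambdas=(0.6, 0.3, 0.1)):
        self.l3, self.l2, self.l1 = lambdas
        self.uni, self.bi, self.tri = Counter(), Counter(), Counter()
        self.bi_ctx, self.tri_ctx = Counter(), Counter()
        for sent in corpus:
            toks = ["<s>", "<s>"] + sent + ["</s>"]
            for i in range(2, len(toks)):
                u, v, w = toks[i - 2], toks[i - 1], toks[i]
                self.uni[w] += 1
                self.bi[(v, w)] += 1
                self.tri[(u, v, w)] += 1
                self.bi_ctx[v] += 1
                self.tri_ctx[(u, v)] += 1
        self.total = sum(self.uni.values())

    def prob(self, u, v, w):
        # Maximum-likelihood estimate at each order, falling back to 0
        # when the corresponding context was never observed.
        p1 = self.uni[w] / self.total if self.total else 0.0
        p2 = self.bi[(v, w)] / self.bi_ctx[v] if self.bi_ctx[v] else 0.0
        p3 = (self.tri[(u, v, w)] / self.tri_ctx[(u, v)]
              if self.tri_ctx[(u, v)] else 0.0)
        return self.l3 * p3 + self.l2 * p2 + self.l1 * p1


corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
lm = InterpolatedTrigramLM(corpus)
print(lm.prob("the", "cat", "sat"))  # interpolated P(sat | the cat)
```

Because the weights sum to one and the unigram term is defined for any seen word, the interpolated probability stays nonzero even when the trigram context is unseen, which is the usual motivation for interpolating across n-gram orders.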
Similar papers
Efficient estimation of maximum entropy language models with n-gram features: an SRILM extension
We present an extension to the SRILM toolkit for training maximum entropy language models with N-gram features. The extension uses a hierarchical parameter estimation procedure [1] for making the training time and memory consumption feasible for moderately large training data (hundreds of millions of words). Experiments on two speech recognition tasks indicate that the models trained with our ...
Improving trigram language modeling with the World Wide Web
We propose a novel method for using the World Wide Web to acquire trigram estimates for statistical language modeling. We submit an N-gram as a phrase query to web search engines. The search engines return the number of web pages containing the phrase, from which the N-gram count is estimated. The N-gram counts are then used to form web-based trigram probability estimates. We discuss the proper...
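A rough sketch of this procedure is shown below, assuming a hypothetical get_page_count helper that stands in for a real search-engine hit-count query; the paper's actual engines and smoothing details are not reproduced here.

```python
def get_page_count(phrase: str) -> int:
    """Hypothetical stand-in: a real system would submit `phrase`
    as a quoted query to a web search engine and read back the
    reported number of matching pages."""
    raise NotImplementedError

def web_trigram_prob(w1: str, w2: str, w3: str) -> float:
    # Estimate P(w3 | w1 w2) as the ratio of the trigram phrase's
    # page count to the bigram context's page count, using web page
    # counts as N-gram count estimates, per the approach above.
    tri = get_page_count(f'"{w1} {w2} {w3}"')
    bi = get_page_count(f'"{w1} {w2}"')
    return tri / bi if bi else 0.0
```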
Fitting long-range information using interpolated distanced n-grams and cache models into a latent dirichlet language model for speech recognition
We propose a language modeling (LM) approach using interpolated distanced n-grams into a latent Dirichlet language model (LDLM) [1] for speech recognition. The LDLM relaxes the bag-of-words assumption and document topic extraction of latent Dirichlet allocation (LDA). It uses default background n-grams where topic information is extracted from the (n-1) history words through Dirichlet distributi...
Interpolated Dirichlet Class Language Model for Speech Recognition Incorporating Long-distance N-grams
We propose a language modeling (LM) approach incorporating interpolated distanced n-grams in a Dirichlet class language model (DCLM) (Chien and Chueh, 2011) for speech recognition. The DCLM relaxes the bag-of-words assumption and document topic extraction of latent Dirichlet allocation (LDA). The latent variable of DCLM reflects the class information of an n-gram event rather than the topic in...
Similar n-gram language model
This paper describes an extension of the n-gram language model: the similar n-gram language model. The estimation of the probability P(s) of a string s by the classical model of order n is computed using statistics of occurrences of the last n words of the string in the corpus, whereas the proposed model further uses all the strings s′ for which the Levenshtein distance to s is smaller than a g...
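To illustrate the similarity idea, the following sketch computes a word-level Levenshtein distance and pools corpus counts over all stored n-grams within a distance bound; the distance threshold and the unweighted pooling are assumptions of this sketch, not the paper's exact estimator.

```python
from collections import Counter

def levenshtein(a, b):
    """Word-level edit distance by dynamic programming."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

def similar_count(s, ngram_counts, max_dist=1):
    """Pool corpus counts of every stored n-gram whose Levenshtein
    distance to `s` is within `max_dist` (unweighted, for illustration)."""
    return sum(c for s2, c in ngram_counts.items()
               if levenshtein(s, s2) <= max_dist)

counts = Counter({("the", "cat", "sat"): 3, ("the", "dog", "sat"): 2})
print(similar_count(("the", "cat", "sat"), counts))  # 5: both within distance 1
```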